Abstract. A sequence $S = s_1 s_2 \ldots s_n$ is nonrepetitive if no two
adjacent blocks of $S$ are identical. In 1906 Thue proved that there exist
arbitrarily long nonrepetitive sequences over a 3-element set of symbols.
We study a generalization of nonrepetitive sequences involving arithmetic
progressions. We prove that for every $k \geq 1$, there exist arbitrarily
long sequences over at most $2k + 10\sqrt{k}$ symbols whose subsequences,
indexed by arithmetic progressions with common differences from the
set $\{1, 2, \ldots, k\}$, are nonrepetitive. This improves a previous bound
obtained in [9]. Our approach is based on a technique introduced recently
in [12], which was originally inspired by a constructive proof of
the Lovász Local Lemma due to Moser and Tardos [15]. We also discuss
some related problems that can be attacked by this method.



1. Introduction 

For a sequence $S = s_1 s_2 \ldots s_n$, a repetition of size $h$ is a block
(subsequence of consecutive terms) of the form
$XX = x_1 \ldots x_h x_1 \ldots x_h$. A sequence is nonrepetitive if it does
not contain a repetition of any size $h \geq 1$. For example, the sequence
1231312 contains a repetition 3131 of size two, while 123132123 is
nonrepetitive.
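
The definition can be checked mechanically. The following Python sketch is
an illustration only; the function name `is_nonrepetitive` is an assumption
introduced here, and the brute-force search over all block sizes and start
positions is chosen for clarity rather than speed.

```python
def is_nonrepetitive(seq):
    """Return True if no block of `seq` is immediately followed by an identical block."""
    n = len(seq)
    for h in range(1, n // 2 + 1):          # size h of the candidate block X
        for start in range(n - 2 * h + 1):  # first index of the candidate XX
            if seq[start:start + h] == seq[start + h:start + 2 * h]:
                return False
    return True


# The two examples from the text.
print(is_nonrepetitive("1231312"))    # False: contains the repetition 3131
print(is_nonrepetitive("123132123"))  # True
```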

It is easy to see that the longest nonrepetitive sequence that can be
constructed over a set of two symbols has length three. In 1906 Thue [18]
proved, by a remarkable inductive construction, that there exist arbitrarily
long nonrepetitive sequences over just three different symbols (see also [5],
[1]). This discovery resulted in many unexpected applications, inspiring a
stream of research and leading to the emergence of new branches of
mathematics with a variety of challenging open problems (see [1], [3], [8],
[11]).



One particular variant, proposed in [3], concerns nonrepetitive tilings,
i.e., assignments of symbols to lattice points of the plane so that all lines in
prescribed directions are nonrepetitive. This idea led Currie and Simpson
[7] to consider sequences with a stronger property: all subsequences taken
over arithmetic progressions of bounded common differences are
nonrepetitive. Let $k \geq 1$ be a fixed positive integer and let $S(k)$ be
the family of subsequences of $S$ of the form
$s_i s_{i+d} s_{i+2d} \ldots s_{i+td}$ with $d \in \{1, 2, \ldots, k\}$,
$1 \leq i \leq k - 1$, $t = \lfloor n/d \rfloor$. If every element of $S(k)$
is a nonrepetitive sequence, then $S$ is called nonrepetitive up to mod $k$
(see [7]). Let $M(k)$ denote the minimal number of symbols needed to create
arbitrarily long sequences nonrepetitive up to mod $k$. Thue's theorem can
be rephrased as $M(1) = 3$. It is easy to see that $M(k) \geq k + 2$ for
every $k \geq 1$, and one may suspect that equality always holds.

Conjecture 1. $M(k) = k + 2$ for every $k \geq 1$.
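
The lower bound $M(k) \geq k + 2$ can be observed by exhaustive search for
small $k$. The Python sketch below is not taken from the paper: the names
`ends_with_repetition` and `longest_up_to_mod_k` and the `max_len` cap are
assumptions made for illustration. It reads "nonrepetitive up to mod $k$" as:
every subsequence taken along an arithmetic progression with common
difference $d \leq k$ is nonrepetitive. With only $k + 1$ symbols any
$k + 1$ consecutive terms must be pairwise distinct, so the sequence is
forced to be periodic and the search is finite.

```python
def ends_with_repetition(seq, k):
    """True if, for some difference d <= k, the progression through the last
    term of `seq` ends with a repetition XX."""
    n = len(seq)
    for d in range(1, k + 1):
        tail = seq[(n - 1) % d::d]   # terms whose index is congruent to n-1 mod d
        m = len(tail)
        for h in range(1, m // 2 + 1):
            if tail[m - 2 * h:m - h] == tail[m - h:]:
                return True
    return False


def longest_up_to_mod_k(k, symbols, max_len=30):
    """Length of the longest sequence over `symbols` symbols that is
    nonrepetitive up to mod k, found by depth-first search.
    `max_len` (hypothetical safety cap) stops the search when long
    sequences exist."""
    best = 0
    stack = [()]
    while stack:
        seq = stack.pop()
        best = max(best, len(seq))
        if len(seq) == max_len:
            continue
        for c in range(symbols):
            ext = seq + (c,)
            # Every prefix is already valid, so only repetitions ending at
            # the new last term have to be checked.
            if not ends_with_repetition(ext, k):
                stack.append(ext)
    return best


print(longest_up_to_mod_k(1, symbols=2))  # 3
print(longest_up_to_mod_k(2, symbols=3))  # 5
```

In both cases the search ends with a finite maximum, so $k + 1$ symbols
cannot produce arbitrarily long sequences for these values of $k$.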

This conjecture has been confirmed so far only for $k = 2$, $3$, and $5$ (it
is not even known for $k = 4$) by providing Thue type constructions of the
desired sequences. However, using the Lovász Local Lemma (see [2]) it was
proved in [9] that $M(k) \leq e^{33} k$ for any $k$. In this paper we improve
the last bound substantially by proving that $M(k) \leq 2k + O(\sqrt{k})$.
Our method is inspired by the recent constructive proof of the Lovász Local
Lemma due to Moser and Tardos [15]. Just like in the related paper [12] (and
in the original nonconstructive approach), we prove the result for a more
general version where symbols are chosen from prescribed lists (sets)
assigned to the positions in a sequence. The same method applies in the case
when $K$ is any $k$-element set of positive integers, and we want to
construct arbitrarily long sequences with no repetitions on arithmetic
subsequences with differences from $K$.

2. The algorithm 

We present an algorithm that generates consecutive terms of a sequence $S$
by choosing symbols at random (uniformly and independently), and every
time a repetition occurs, it erases the longest repeated block and
continues from the smallest unassigned position. We always erase the block
that contains the last chosen element in order to ensure that after this
removal the remaining sequence stays nonrepetitive. In the listing of the
algorithm, the value 0 of some $s_j$ means that no symbol is assigned to
$s_j$. Initially all $s_j$ are equal to 0.

We show that for any given positive integer $n$, and arbitrary lists of
symbols $L_i$, each of size at least $2k + 10\sqrt{k}$, the Algorithm
computes a sequence of length $n$ which is nonrepetitive up to mod $k$.
Random elements in line (3) of the Algorithm are chosen independently with
uniform distribution. The general idea is to prove that the Algorithm cannot
work forever for all possible evaluations of the random experiments. It is
easy to see that the Algorithm stops only if a sequence nonrepetitive up to
mod $k$ is constructed.

Theorem 1. For every positive integer $n$, and for every sequence of sets
$L_1, \ldots, L_n$, each of size at least $2k + 10\sqrt{k}$, there is a
sequence $S = s_1 \ldots s_n$ nonrepetitive up to mod $k$ such that
$s_i \in L_i$ for every $i = 1, 2, \ldots, n$.

Proof. Let us suppose for a contradiction that such a sequence does not
exist. It means that the Algorithm never stops. We are going to count the
possible sequences of random values used in line (3) of the Algorithm in two
ways.

Let $r_j$, $1 \leq j \leq M$, be a sequence of values chosen in line (3) in
the first $M$ choices of some run of the Algorithm. Since at most $2k$
symbols are excluded from a list of size at least $2k + 10\sqrt{k}$, each
$r_j$ can take at least $10\sqrt{k}$ values. It means that there are at
least $(10\sqrt{k})^M$ such sequences.

Algorithm 1 Choosing a sequence which is non-repetitive up to mod $k$.
 1: $i \leftarrow 1$
 2: while $i \leq n$ do
 3:   $s_i \leftarrow$ random element of $L_i \setminus \{s_{i-k}, s_{i-k+1}, \ldots, s_{i-1}, s_{i+1}, \ldots, s_{i+k}\}$
 4:   if $s_1, \ldots, s_n$ is non-repetitive with respect to non-zero elements then
 5:     $i \leftarrow$ smallest index $j$ for which $s_j = 0$
 6:   else
 7:     from the set of the longest repetitions in $S$ choose
          $s_{j-2h \cdot d+d}, \ldots, s_{j-h \cdot d}, s_{j-h \cdot d+d}, \ldots, s_j$
        with the largest index of the first element $j - 2h \cdot d + d$
 8:     if $i \leq j - h \cdot d$ then
 9:       $m \leftarrow j - 2h \cdot d + d$
10:     else
11:       $m \leftarrow j - h \cdot d + d$
12:     end if
13:     for $j = 1$ to $h$ do
14:       $s_m \leftarrow 0$
15:       $m \leftarrow m + d$
16:     end for
17:     $i \leftarrow$ smallest index $j$ for which $s_j = 0$
18:   end if
19: end while
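
The listing above can be tried out with a short Python sketch. It is a
simplified variant under stated assumptions, not the exact procedure of
Algorithm 1: instead of selecting the longest repetition with the largest
first index, it erases the half containing the last chosen position of the
first repetition it finds. It uses `None` in place of 0, indexes positions
from 0, and stops after `max_choices` random choices. All function and
parameter names are hypothetical.

```python
import math
import random


def repetition_through(s, i, k):
    """Find a repetition on a progression with difference d <= k that contains
    position i and consists only of assigned terms. Returns (start, d, h)."""
    n = len(s)
    for d in range(1, k + 1):
        for h in range(1, (n - 1) // (2 * d) + 2):
            for offset in range(2 * h):          # rank of i inside the block XX
                start = i - offset * d
                if start < 0 or start + (2 * h - 1) * d >= n:
                    continue
                idx = [start + t * d for t in range(2 * h)]
                if any(s[p] is None for p in idx):
                    continue
                if all(s[idx[t]] == s[idx[t + h]] for t in range(h)):
                    return start, d, h
    return None


def build_sequence(lists, k, rng, max_choices=100_000):
    """Random choice with erasure; each list needs more than 2k symbols."""
    n = len(lists)
    s = [None] * n
    choices_made = 0
    while None in s:
        if choices_made == max_choices:
            raise RuntimeError("gave up after max_choices random choices")
        choices_made += 1
        i = s.index(None)                        # smallest unassigned position
        lo, hi = max(0, i - k), min(n, i + k + 1)
        forbidden = {s[p] for p in range(lo, hi) if s[p] is not None}
        s[i] = rng.choice([c for c in lists[i] if c not in forbidden])
        found = repetition_through(s, i, k)
        if found is not None:
            start, d, h = found
            if i >= start + h * d:               # i lies in the second half
                start += h * d
            for t in range(h):                   # erase the half containing i
                s[start + t * d] = None
    return s


def nonrepetitive_up_to_mod_k(s, k):
    """Check every progression with difference d <= k for repetitions."""
    for d in range(1, k + 1):
        for r in range(d):
            sub = s[r::d]
            for h in range(1, len(sub) // 2 + 1):
                for a in range(len(sub) - 2 * h + 1):
                    if sub[a:a + h] == sub[a + h:a + 2 * h]:
                        return False
    return True


k, n = 2, 40
size = 2 * k + math.ceil(10 * math.sqrt(k))      # list size from Theorem 1
lists = [list(range(size)) for _ in range(n)]
result = build_sequence(lists, k, random.Random(0))
print(result, nonrepetitive_up_to_mod_k(result, k))
```

Because every new repetition must contain the position just filled, erasing
the half that contains it restores a state with no repetition among assigned
terms, which is the invariant the sketch relies on.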



The second way of counting involves descriptions of the behaviour of the 
Algorithm. For every fixed evaluation of the first M random choices we 
define the following five elements: 

• A route $R$ in the upper right quadrant of the grid
$\mathbb{Z} \times \mathbb{Z}$ from the point $(0, 0)$ to the point
$(2M, 0)$ in $2M$ steps, with possible moves $(1, 1)$ and $(1, -1)$, which
never goes below the axis $y = 0$.

• A sequence $D$ of numbers between 1 and $k$ corresponding to the
peaks on the route $R$, where by a peak we mean a move $(1, 1)$ followed
immediately by a move $(1, -1)$.

• A sequence $O$ of numbers $-1$ or $1$ corresponding to the peaks on
the route $R$.

• A sequence $P$ of integers, one for every peak, whose sum is not greater
than $M$.

• A sequence $S$ produced by the Algorithm after $M$ steps.

A pentad $(R, D, O, P, S)$ will be called a log. We encode consecutive steps
of the Algorithm into a log in the following way:

Each time the Algorithm executes line (3) we append a move $(1, 1)$ to
the route $R$, and for every execution of line (14) we append $(1, -1)$.
Notice that in line (14) the Algorithm can set to zero only those $s_m$
which are non-zero, therefore the number of down-steps on the route $R$
never exceeds the number of up-steps, and the route never goes below the
axis $y = 0$. At the end of computations we add to the route $R$ one
down-step for each element of $S$ which is non-zero. This brings us to the
point $(2M, 0)$. Whenever Algorithm 1 executes line (7) we append to the
sequence $D$ the difference $d$ of the chosen longest repetition. Then, if
the condition in line (8) is true, we append $1$ to the sequence $O$,
otherwise we append $-1$. For each execution of the loop (13)-(16) we append
to the sequence $P$ the value $j$ for which $m$ equals $i$ in the loop.
Finally, $S$ is the sequence produced by the Algorithm after $M$ executions
of line (3).

Claim 1. Every log corresponds to a unique sequence $r_j$,
$1 \leq j \leq M$, of the first $M$ values chosen in line (3) in some
execution of the Algorithm.

Proof. For a given log $(R, D, O, P, S)$ we are going to decode
$r_1, \ldots, r_M$. At first we use information from the route $R$ and the
sequences $D$ and $P$ to determine which $s_i$ were non-zero at each step of
the Algorithm and to find the coordinates of the elements which were zeroed
in line (14) of the Algorithm. Notice that each operation of setting a
non-zero value to some $s_i$ corresponds to an up-step $(1, 1)$ on the route
$R$, while each zeroing of $s_i$ corresponds to some down-step $(1, -1)$ on
the route $R$. We examine the route $R$ from the point $(0, 0)$ to the point
$(2M, 0)$. Assume that the first peak occurs after the $j$th step. Since
this is the first time we erase some elements $s_i$, we know that
$s_1, \ldots, s_j$ are the only non-zero elements at this point. Now we use
information encoded in $D$ and $P$. We look at the number of consecutive
down-steps on $R$ (which in this case is equal to $p_1$) and remember that
for this peak we zeroed
$s_j, s_{j-d_1}, s_{j-2d_1}, \ldots, s_{j-(p_1-1)d_1}$. Then again each
up-step on $R$ denotes setting some value to the zeroed position with the
smallest index $i$. Proceeding in that way we know exactly which position
was set last when we reach the next peak. From the number of consecutive
down-steps on $R$ we deduce the length of the zeroed repeated block. The
value in the sequence $D$ corresponding to the peak denotes the difference
of the arithmetic subsequence in which the repetition occurred. Finally, the
corresponding value from the sequence $P$ describes the position of the
symbol just set within the erased repeated block. From all this information
it is easy to deduce which positions were zeroed as a result of erasing the
repetition. We repeat these operations until we get to the end of $R$.

After this preparatory step we are ready to decode $r_1, \ldots, r_M$. We
consider the route $R$ in reverse order, from the last point $(2M, 0)$ to
the first $(0, 0)$, modifying the final sequence $S$. This time we use
information encoded in $S$ and $O$, and the knowledge determined in the
preparatory step. As we said before, each up-step $(1, 1)$ on the route $R$
corresponds to some $r_j$. For every such up-step we have already
determined, in the preparatory analysis, the index of the element of $S$
that received $r_j$. At the beginning, going backward on $R$, there is some
number of down-steps corresponding to non-zero elements of $S$ (the elements
added at the end of computations). We skip them and move on. Then, each time
there is an up-step on $R$, we assign to $r_j$ the value of the appropriate
$s_i$ (where $i$ was determined in the preparatory step), and set $s_i$ to
0. In fact, to determine the real outcome of the random experiment (i.e.,
the index of the chosen element on the list of elements available at this
step), we must take into account the forbidden symbols from the $k$
preceding and $k$ following places on $S$. Every consecutive sequence of $t$
down-steps on $R$ corresponds to the erasure of some repeated block during
the execution of the Algorithm. Then we assign to
$s_i, s_{i+d_l}, \ldots, s_{i+(t-1)d_l}$ the corresponding values
$s_{i+o_l t d_l}, s_{i+d_l+o_l t d_l}, \ldots, s_{i+(t-1)d_l+o_l t d_l}$
(where $s_i$ is the first element of the erased repeated block determined in
the preparatory step, and $d_l$ and $o_l$ are the entries of $D$ and $O$ for
this peak). These are exactly the values from the repetitions erased at step
(17) of the Algorithm. □

We showed that each sequence of randomly chosen values during the execution
of the Algorithm corresponds to some log, and that this mapping is
injective. This implies that the number of different logs is always greater
than or equal to the number of feasible sequences $r_1, \ldots, r_M$. Let
$L$ be the size of the set of all possible logs. To bound $L$ from above we
have to determine the number of different structures for each element in a
log. The number of all possible routes in the upper right quadrant of a grid
of length $2M$ with possible moves $(1, 1)$ and $(1, -1)$ is well known to
be the $M$th Catalan number $C_M$. Since in every choice in line (3) the
elements occurring within the distance $k$ are excluded, the Algorithm
cannot produce a repeated block of length 1. It means that the subsequence
$(1, 1), (1, -1), (1, 1)$ cannot occur in the route $R$. Therefore the
number of peaks within $R$ cannot exceed $M/2$. Thus there can be at most
$k^{M/2}$ possible sequences $D$. Respectively, there are at most $2^{M/2}$
possible evaluations for the sequence $O$.
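
The count of routes can be confirmed for small $M$. The following Python
sketch is an illustration only; the names `count_routes` and `catalan` are
assumptions introduced here. It counts the routes by dynamic programming
over the current height and compares the result with the closed formula
$C_M = \binom{2M}{M}/(M+1)$.

```python
import math


def count_routes(M):
    """Count routes of 2M steps (1,1)/(1,-1) from (0,0) to (2M,0)
    that never go below the axis y = 0."""
    ways = {0: 1}                    # ways[y] = number of partial routes at height y
    for _ in range(2 * M):
        nxt = {}
        for y, w in ways.items():
            nxt[y + 1] = nxt.get(y + 1, 0) + w      # up-step
            if y > 0:
                nxt[y - 1] = nxt.get(y - 1, 0) + w  # down-step, staying at y >= 0
        ways = nxt
    return ways.get(0, 0)


def catalan(M):
    """M-th Catalan number via the binomial formula."""
    return math.comb(2 * M, M) // (M + 1)


for M in range(1, 7):
    print(M, count_routes(M), catalan(M))
# 1 1 1
# 2 2 2
# 3 5 5
# 4 14 14
# 5 42 42
# 6 132 132
```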

The sequence $S$ consists of $n$ elements of value between 0 and
$2k + 10\sqrt{k}$, which gives us $(2k + 10\sqrt{k})^n$ possible evaluations
for this sequence. For every fixed route $R$ with $m$ peaks corresponding to
repeated blocks of lengths $p_1, \ldots, p_m$ we have at most
$p_1 p_2 \cdots p_m$ sequences which can occur as $P$. Therefore, for an
upper bound on the number of sequences $P$, we determine the maximum value
of the product $p_1 p_2 \cdots p_m$ with $p_1 + \ldots + p_m = M$. The
inequality between the arithmetic and geometric means implies that the
maximum is obtained when all $p_i$ are the same. Denote their common value
by $x$. Then we must determine $\max(x^{M/x})$. Since

$$\left(x^{M/x}\right)' = x^{M/x}\left(\frac{M}{x^2} - \frac{M \log(x)}{x^2}\right),$$

we get that the maximum value is obtained with $x = e$ and equals
$e^{M/e} \approx 1.44467^M < 1.5^M$.
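
The constant can be verified numerically. This short Python sketch is an
illustration only; the helper name `per_unit_factor` is an assumption. It
evaluates $x^{1/x}$, the factor contributed per unit of $M$ when all block
lengths equal $x$.

```python
import math


def per_unit_factor(x):
    """x**(1/x): the M-th root of x**(M/x)."""
    return x ** (1.0 / x)


best = math.e ** (1.0 / math.e)
print(round(best, 5))                          # 1.44467

# Among integer block lengths the best choice is 3, which stays below the
# real-valued maximum at x = e.
print(max(range(1, 50), key=per_unit_factor))  # 3
```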

All these bounds bring us to the conclusion that the number of possible
logs does not exceed

$$(2k + 10\sqrt{k})^n \, C_M \, k^{M/2} \, 2^{M/2} \, (1.5)^M.$$

Comparing with the number of evaluations of a sequence $(r_j)$ we get the
inequality

$$(10\sqrt{k})^M \leq (2k + 10\sqrt{k})^n \, C_M \, k^{M/2} \, 2^{M/2} \, (1.5)^M.$$

Asymptotically, Catalan numbers grow as
$C_n \sim \frac{4^n}{n\sqrt{\pi n}}$, which implies that

$$(10\sqrt{k})^M \leq (2k + 10\sqrt{k})^n \, \frac{4^M}{M\sqrt{\pi M}} \, k^{M/2} \, 2^{M/2} \, (1.5)^M.$$

The right hand side is $o((10\sqrt{k})^M)$, since
$4 \cdot \sqrt{2} \cdot 1.5 \approx 8.49 < 10$; therefore for large enough
$M$ the inequality cannot hold. We get a contradiction, from which we
conclude that for some specific choices of $r_1, r_2, \ldots$ the Algorithm
stops. □
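
The point at which the two counts become incompatible can be located
numerically. The Python sketch below is an illustration only; the function
names and the sample values $n = 10$, $k = 4$ are assumptions chosen for the
example. It works with logarithms and uses the exact Catalan number
(through `math.lgamma`) in the upper bound on the number of logs.

```python
import math


def log_lower(M, k):
    """log of (10*sqrt(k))**M, the lower bound on the number of sequences (r_j)."""
    return M * math.log(10 * math.sqrt(k))


def log_upper(M, n, k):
    """log of (2k + 10 sqrt(k))**n * C_M * k**(M/2) * 2**(M/2) * 1.5**M."""
    log_catalan = math.lgamma(2 * M + 1) - 2 * math.lgamma(M + 1) - math.log(M + 1)
    return (n * math.log(2 * k + 10 * math.sqrt(k)) + log_catalan
            + (M / 2) * math.log(k) + (M / 2) * math.log(2) + M * math.log(1.5))


def first_contradiction(n, k, limit=10**6):
    """Smallest M below `limit` at which the upper bound drops below the lower bound."""
    for M in range(1, limit):
        if log_upper(M, n, k) < log_lower(M, k):
            return M
    return None


# Exponential growth rate of the upper bound, per factor sqrt(k); it is below 10.
print(4 * math.sqrt(2) * 1.5)
print(first_contradiction(n=10, k=4))
```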

The above proof can be applied in a more general setting. 
Theorem 2. Let $K$ be a fixed set of $k$ positive integers. Then for every
$n \geq 1$ and for any sequence of sets $L_1, \ldots, L_n$ of size at least
$2k + 10\sqrt{k}$ each, there exists a sequence $S = s_1 \ldots s_n$ with
$s_i \in L_i$ for all $i = 1, 2, \ldots, n$ which is nonrepetitive on every
arithmetic progression whose common difference is in $K$.

Note that in the proof of Theorem 1 we focused only on the number of
forbidden substructures, not their values. Given an arbitrary set of common
differences we order and enumerate them from 1 up to $k$. We can repeat the
above reasoning with just one change: the sequence $D$ consists of elements
of $K$ (but there are still $k$ of them).

3. A related geometric problem

As stated in the introduction, the problem of finding sequences
nonrepetitive up to mod $k$ has its origin in a geometric problem of
nonrepetitive coloring of points in the plane. We can apply our proof
technique to a more general question in this setting. The following problem
concerning nonrepetitive colorings of discrete sets of points in
$\mathbb{R}^n$ was considered in [9]. Let $P$ be a discrete set of points
and let $L$ be a fixed set of lines in $\mathbb{R}^n$. A coloring of $P$ is
nonrepetitive (with respect to $L$) if each line in $L$ is colored
nonrepetitively (i.e., no sequence of consecutive points on any $l \in L$
forms a repetition). For a point $p \in P$ let $i(p)$ denote the number of
lines from $L$ incident with $p$ and let
$I = I(P, L) = \max\{i(p) : p \in P\}$ be the maximum incidence of the
configuration $(P, L)$. Using the Lovász Local Lemma it was proved in [9]
that $I e^{(8I^2 + 8I - 4)/(I - 1)^2}$ colors are sufficient to get such a
coloring. Adapting the proof of Theorem 2 we can get a better bound.

Theorem 3. Let $(P, L)$ be a configuration of points and lines in
$\mathbb{R}^n$ with finite maximum incidence $I \geq 2$. If
$C \geq 2I + 10\sqrt{I}$ then there is a nonrepetitive $C$-coloring of $P$
with respect to $L$.

Proof. The argument is essentially the same as in the proof of Theorem 2.
We provide an algorithm in which each point is colored at random by one of
$2I + 10\sqrt{I}$ colors. Fix any linear ordering of all points in $P$. We
color them in this order using Algorithm 1, where arithmetic progressions
are replaced by lines in $\mathbb{R}^n$. Similarly, for a given point
$p \in P$ and every line $l \in L$ such that $p \in l$ we forbid the use of
colors already assigned to the points preceding and following $p$ on $l$.
This gives us at most $2I$ forbidden colors for each point. So, by analogy
to the previous proof, one can show that additional $10\sqrt{I}$ colors
suffice to get a nonrepetitive coloring of $P$ with respect to $L$. For a
log $(R, D, O, P, S)$ we take the same objects as in the last case, with the
exception that now $D$ keeps the information about the line for which we
get a repetition (values between 1 and $I$), and $S$ is a sequence of
numbers between 0 and $I$. Then all calculations run similarly as before. □

4. An open problem 

We would like to conclude the paper with a problem concerning infinite
sets of forbidden differences. Let $K$ be a fixed (possibly infinite) set of
positive integers. A coloring of the integers is $K$-nonrepetitive if every
arithmetic progression with common difference in $K$ forms a nonrepetitive
sequence. Denote by $\pi(K)$ the minimum number of colors (possibly
infinite) needed for a $K$-nonrepetitive coloring of $\mathbb{Z}$.

A natural question is for which sets $K$ the number $\pi(K)$ is finite. An
obvious necessary condition is that the related integer distance graph (i.e.,
a graph on the set of vertices $\mathbb{Z}$ with two integers $a > b$ joined
by an edge whenever their difference $a - b$ is in $K$) has finite chromatic
number, denoted by $\chi(K)$. Theorem 2 shows that $\pi(K)$ is finite for
finite sets $K$. More intriguing in this respect is the case of infinite
sets $K$. We offer the following conjecture in the spirit of Erdős.

Conjecture 2. $\pi(K)$ is finite for every lacunary set $K$.

A set $K = \{k_1 < k_2 < \ldots\}$ is lacunary if there is a real number
$\delta > 0$ such that $k_{i+1}/k_i > 1 + \delta$ for all indices $i$. For
instance, the set of powers of 2 and the set of Fibonacci numbers are
lacunary. It is known that for such sets the usual chromatic number
$\chi(K)$ is finite [13], [16], [17]. However, there are non-lacunary sets
with a finite chromatic number. A complete characterization of such sets is
not known and, as pointed out by Ruzsa (personal communication), this
problem is connected to some deep questions in additive number theory. A
trivial example is the set of odd positive integers, whose chromatic number
is 2. Curiously, for the nonrepetitive variant just 4 colors suffice, as
proved by Carpi [6], which supports an even stronger supposition that
perhaps $\pi(K)$ is finite if and only if $\chi(K)$ is finite.
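
The lacunarity condition can be illustrated on finite prefixes of the
example sets. The Python sketch below is an illustration only; the helper
name `min_ratio` and the prefix lengths are assumptions. A finite prefix
cannot prove lacunarity, it only shows how the ratio of consecutive elements
behaves.

```python
def min_ratio(values):
    """Smallest ratio of consecutive elements of an increasing finite list."""
    return min(b / a for a, b in zip(values, values[1:]))


powers_of_two = [2 ** i for i in range(20)]

fibonacci = [1, 2]                      # distinct Fibonacci numbers 1, 2, 3, 5, ...
while len(fibonacci) < 20:
    fibonacci.append(fibonacci[-1] + fibonacci[-2])

odd_numbers = list(range(1, 40, 2))

print(min_ratio(powers_of_two))  # 2.0
print(min_ratio(fibonacci))      # 1.5
print(min_ratio(odd_numbers))    # 39/37, and it tends to 1 as the prefix grows
```

For the odd numbers the ratio approaches 1, so no $\delta > 0$ works, which
matches their role in the text as a non-lacunary example.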